Understanding deep learning requires rethinking generalization

نویسندگان

  • Chiyuan Zhang
  • Samy Bengio
  • Moritz Hardt
  • Benjamin Recht
  • Oriol Vinyals
چکیده

Despite their massive size, successful deep artificial neural networks can exhibit a remarkably small difference between training and test performance. Conventional wisdom attributes small generalization error either to properties of the model family, or to the regularization techniques used during training. Through extensive systematic experiments, we show how these traditional approaches fail to explain why large neural networks generalize well in practice. Specifically, our experiments establish that state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data. This phenomenon is qualitatively unaffected by explicit regularization, and occurs even if we replace the true images by completely unstructured random noise. We corroborate these experimental findings with a theoretical construction showing that simple depth two neural networks already have perfect finite sample expressivity as soon as the number of parameters exceeds the number of data points as it usually does in practice. We interpret our experimental findings by comparison with traditional models.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior

We describe an approach to understand the peculiar and counterintuitive generalization properties of deep neural networks. The approach involves going beyond worst-case theoretical capacity control frameworks that have been popular in machine learning in recent years to revisit old ideas in the statistical mechanics of neural networks. Within this approach, we present a prototypical Very Simple...

متن کامل

Understanding Deep Learning Generalization by Maximum Entropy

Deep learning achieves remarkable generalization capability with overwhelming number of model parameters. Theoretical understanding of deep learning generalization receives recent attention yet remains not fully explored. This paper attempts to provide an alternative understanding from the perspective of maximum entropy. We first derive two feature conditions that softmax regression strictly ap...

متن کامل

Deep Learning: Generalization Requires Deep Compositional Feature Space Design

Generalization error defines the discriminability and the representation power of a deep model. In this work, we claim that feature space design using deep compositional function plays a significant role in generalization along with explicit and implicit regularizations. Our claims are being established with several image classification experiments. We show that the information loss due to conv...

متن کامل

Generalization and Expressivity for Deep Nets

Along with the rapid development of deep learning in practice, theoretical explanations for its success become urgent. Generalization and expressivity are two widely used measurements to quantify theoretical behaviors of deep learning. The expressivity focuses on finding functions expressible by deep nets but cannot be approximated by shallow nets with the similar number of neurons. It usually ...

متن کامل

Lessons from the Rademacher Complexity for Deep Learning

Understanding the generalization properties of deep learning models is critical for successful applications, especially in the regimes where the number of training samples is limited. We study the generalization properties of deep neural networks via the empirical Rademacher complexity and show that it is easier to control the complexity of convolutional networks compared to general fully conne...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1611.03530  شماره 

صفحات  -

تاریخ انتشار 2016